17 research outputs found

    A Formal Account of the Open Provenance Model

    On the Web, where resources such as documents and data are published, shared, transformed, and republished, provenance is a crucial piece of metadata that allows users to place their trust in the resources they access. The Open Provenance Model (OPM) is a community data model for provenance that is designed to facilitate the meaningful interchange of provenance information between systems. Underpinning OPM is the notion of a directed graph, where nodes represent data products and processes involved in past computations, and edges represent dependencies between them; it is complemented by graphical inference rules allowing new dependencies to be derived. Until now, however, the OPM model was a purely syntactical endeavor. The present paper extends OPM graphs with an explicit distinction between precise and imprecise edges. A formal semantics for the enriched OPM graphs is then proposed, by viewing OPM graphs as temporal theories on the temporal events represented in the graph. The original OPM inference rules are scrutinized in view of the semantics and found to be sound but incomplete. An extended set of graphical rules is provided and proved to be complete for inference. The paper concludes with applications of the formal semantics to inferencing in OPM graphs, operators on OPM graphs, and a formal notion of refinement among OPM graphs.

    The Foundations of the Open Provenance Model

    The Open Provenance Model (OPM) is a community-driven data model for provenance that is designed to support interoperability of provenance technology. Underpinning OPM is the notion of a directed acyclic graph, used to represent data products and processes involved in past computations, and the causal dependencies between them. The Open Provenance Model was derived following two "Provenance Challenges", international, multi-disciplinary activities that investigated how to exchange information between multiple systems supporting provenance and how to query it. The OPM design was mostly driven by practical and pragmatic considerations, and is being tested in a third Provenance Challenge, which has just started. The purpose of this paper is to investigate the theoretical foundations of this data model. The formalisation consists of a set-theoretic definition of the data model, a definition of the inferences by transitive closure that are permitted, a formal description of how the model can be used to express dependencies in past computations, and finally, a description of the kind of time-based inferences that are supported. A novel element that OPM introduces is the concept of an account, by which multiple descriptions of the same execution are allowed to co-exist in the same graph. Our formalisation gives a precise meaning to such accounts and to the associated notions of alternate and refinement.
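
To make "inferences by transitive closure" concrete, here is a minimal sketch of deriving indirect dependencies in a directed acyclic dependency graph. It is an illustration only, not the paper's formalisation: the edge representation, the function name `transitive_closure`, and the node names are assumptions, and OPM restricts which edge types may actually be composed in this way.

```python
from collections import defaultdict


def transitive_closure(edges):
    """Return every (source, target) pair reachable via one or more edges.

    `edges` is an iterable of (source, target) pairs meaning
    "source depends on target". This generic encoding is hypothetical
    and is not OPM's own edge vocabulary.
    """
    successors = defaultdict(set)
    for src, dst in edges:
        successors[src].add(dst)

    closure = set()
    for start in list(successors):
        # Depth-first walk from each node that has outgoing edges.
        stack = list(successors[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            stack.extend(successors.get(node, ()))
    return closure


# Hypothetical example: a report derived from a chart derived from a dataset.
edges = [("report", "chart"), ("chart", "dataset")]
inferred = transitive_closure(edges) - set(edges)
```

Here `inferred` holds only the dependencies that were not stated explicitly, which in this example is the single indirect pair linking the report to the dataset.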

    Petri Net + Nested Relational Calculus =

    In this paper we propose a formal, graphical workflow language for dataflows, i.e., workflows where large amounts of complex data are manipulated and the structure of the manipulated data is reflected in the structure of the workflow. It is a common extension of Petri nets, which are responsible for the organization of the processing tasks, and of nested relational calculus, which is a database query language over complex objects and is responsible for handling collections of data items (in particular, for iteration) and for the typing system. We demonstrate that dataflows constructed in a hierarchical manner, according to a set of refinement rules we propose, are sound: when initiated with a single token in the input node (which may represent a complex scientific data collection), they terminate with a single token in the output node (which represents the output data collection). In particular, they always process all of the input data, leave no "debris data" behind, and the output is always eventually computed.